An Overview of Preprocessing of Web Log Files for Web Usage Mining
Authors
Abstract
With Internet usage gaining popularity and the number of users growing steadily, the World Wide Web has become a huge repository of data and an important platform for disseminating information. Users’ accesses to Web sites are stored in Web server logs. However, the raw data in these log files do not present an accurate picture of the users’ accesses to the Web site. Hence, preprocessing of the Web log data is an essential prerequisite before the data can be used for knowledge discovery or mining tasks. The preprocessed Web data are then suitable for the discovery and analysis of useful information, referred to as Web mining. Web usage mining, a category of Web mining, is the application of data mining techniques to discover usage patterns from clickstream and associated data stored in one or more Web servers. This paper presents an overview of the various steps involved in the preprocessing stage.
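As a rough illustration of why raw server logs need preprocessing, the sketch below parses entries and drops requests that are unlikely to be page views. It is not the paper's method. It assumes the logs are in Common Log Format, and the names `CLF_PATTERN`, `IGNORED_EXTENSIONS`, `parse_line`, `is_page_view`, and `clean_log`, as well as the sample entries, are hypothetical choices made for this example.

```python
import re

# Assumption: entries follow the Common Log Format:
# host ident authuser [date] "method path protocol" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

# Hypothetical list of extensions treated as embedded resources, not page views.
IGNORED_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


def is_page_view(entry):
    """Keep successful GET requests that do not target an ignored extension."""
    path = entry["path"].split("?", 1)[0].lower()
    return (
        entry["method"] == "GET"
        and entry["status"].startswith("2")
        and not path.endswith(IGNORED_EXTENSIONS)
    )


def clean_log(lines):
    """Parse raw log lines and drop malformed entries and non-page requests."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e is not None and is_page_view(e)]


# Made-up sample entries, for illustration only.
sample = [
    '192.0.2.1 - - [10/Oct/2011:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.1 - - [10/Oct/2011:13:55:37 +0000] "GET /logo.gif HTTP/1.1" 200 512',
    '192.0.2.7 - - [10/Oct/2011:13:56:02 +0000] "GET /missing.html HTTP/1.1" 404 0',
]
cleaned = clean_log(sample)
print([e["path"] for e in cleaned])
```

In this sample, only the request for `/index.html` survives: the image request and the failed request are discarded. Later preprocessing steps, such as identifying users and sessions, would operate on the cleaned entries.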
Similar resources
Study on Various Web Mining Functionalities using Web Log Files
As the Web grows in size along with the number of users, it is essential for website owners to better understand their customers so that they can provide better service and enhance the quality of the website. To achieve this, they depend on Web access log files. These files can be mined to extract interesting patterns so that user behavior can be understoo...
An Overview of Web Usage Mining
Web usage mining makes use of association rule mining to discover interesting patterns, identify Web user behavior, predict Web user expectations, and improve business strategy. Association rule mining is a data mining technique used to find relationships between data items. In Web usage mining, data are stored on the Web server in the form of Web log files. Numerous amou...
A Survey on Preprocessing Methods for Web Usage Data
The World Wide Web is a huge repository of web pages and links. It provides an abundance of information for Internet users. The Web is growing tremendously, with approximately one million pages added daily. Users’ accesses are recorded in web logs. Because of the heavy usage of the Web, web log files are growing rapidly and becoming huge. Web data mining is the applicati...
Data Preprocessing: A Milestone of Web Usage Mining
The Internet today is full of structured and unstructured information, and this information directly or indirectly influences society and people, because the Internet is now part of our daily life. But using this abundant and ambiguous information in the most efficient manner for useful decision making is still a big challenge. During our web surfing, whether it is online shopping or blogging or using tweets...
An Efficient Algorithm for Data Cleaning of Log File using File Extensions
The World Wide Web is a monolithic repository of web pages that provides Internet users with heaps of information. With the growth in the number and complexity of websites, the Web has become massively large. Web usage mining is a division of Web mining that applies mining techniques to Web server logs in order to extract the behavior of users. A Web Usage Mining process com...
Journal title:
Volume, issue
Pages -
Publication date: 2011